-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-53792][SS] Fix rocksdbPinnedBlocksMemoryUsage when bounded memory … #52527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
...ore/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBMemoryManager.scala
Outdated
Show resolved
Hide resolved
...ore/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBMemoryManager.scala
Outdated
Show resolved
Hide resolved
...ore/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBMemoryManager.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala
Outdated
Show resolved
Hide resolved
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/RocksDBSuite.scala
Outdated
Show resolved
Hide resolved
adc1a70
to
0375604
Compare
0375604
to
1ac1e8a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just one final nit. Thanks
*/ | ||
def getInstancePinnedBlocksMemUsage( | ||
uniqueId: String, | ||
pinnedUsage: Long): Long = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: param name pinnedUsage
is different from what is in the description globalPinnedUsage
. Lets just name this param as: totalPinnedUsage
…usage is enabled
What changes were proposed in this pull request?
Changing the way we calculate
pinnedBlocksMemUsage
when an instance has bounded memory enabled.Before the change, we always report back
getDBProperty("rocksdb.block-cache-pinned-usage")
which returns that size of pinned block requested by an instance. This is not accurate when instances share the same cached, because instances might share the same pinned blockAfter this change, when isMemoryBounded is enabled for an instance, we call
lruCache.getPinnedUsage()
to get the total memory usage of SHARED pinned blocks, and divide the global usage with the number of IS_MEMORY_BOUNDED instances. This is because RocksDBMemoryManager will return the same cache for all the instance that has isMemoryBounded = trueUnit test:
build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.execution.streaming.state.RocksDBSuite test
Why are the changes needed?
See above for why this is a bug.
This fix prevents us from over-reporting
pinnedBlocksMemUsage
when isBoundedMemory is enabled for an instanceDoes this PR introduce any user-facing change?
No
How was this patch tested?
See added unit tests on the correctness of the calculation.
Was this patch authored or co-authored using generative AI tooling?
Yes. Generated-by cursor and 'claude-4-sonnet'